Beta distribution misspecification tests with application to Covid-19 mortality rates in the United States

The beta distribution is routinely used to model variables that assume values in the standard unit interval, (0, 1). Several alternative laws have, nonetheless, been proposed in the literature, such as the Kumaraswamy and simplex distributions. A natural and empirically motivated question is: does the beta law provide an adequate representation for a given dataset? We test the null hypothesis that the beta model is correctly specified against the alternative hypothesis that it does not provide an adequate data fit. Our tests are based on the information matrix equality, which only holds when the model is correctly specified. They are thus sensitive to model misspecification. Simulation evidence shows that the tests perform well, especially when coupled with bootstrap resampling. We model state and county Covid-19 mortality rates in the United States. The misspecification tests indicate that the beta law successfully represents Covid-19 death rates when they are computed using either data from prior to the start of the vaccination campaign or data collected when such a campaign was under way. In the latter case, the beta law is only accepted when the negative impact of vaccination reach on death rates is moderate. The beta model is rejected under data heterogeneity, i.e., when mortality rates are computed using information gathered during both time periods.


Introduction
Several variables of interest assume values in the standard unit interval, (0, 1). This is the case, e.g., of rates, proportions and concentration indices. The beta distribution is commonly used to model such variables. For instance, [1] use the beta law to model the probability of HIV transmission in male-to-female sexual encounters and [2] lists applications of the beta law to engineering. Other applications of the beta distribution can be seen in [3] (gear damage analysis), [4] (relative sunshine duration in Malaysia) and [5] (group-based trajectory modeling of neurological activity of comatose cardiac arrest patients). Additionally, [6] note that "[t]he beta distributions are among the most frequently employed to model theoretical distributions". It is noted that the beta law arises naturally in 'normal theory' since $Z_1/(Z_1 + Z_2)$ is beta distributed if $Z_1$ and $Z_2$ are independent chi-squared random variables. The beta distribution can also be obtained as the limiting distribution of the ratio of eigenvalues in a sequence of random matrices.
Alternative distributions with support in the standard unit interval have been proposed in the literature and have been increasingly used in empirical analyses, e.g., the Kumaraswamy (see [7]) and simplex (see [8]) distributions and, more recently, the unit-Weibull (see [9]) and reflected unit Burr XII (see [10]) distributions. It would then be useful to provide practitioners with a test that can be used to determine whether the beta law, which is still the most widely used model for fractional data, yields an adequate data fit. If not, an alternative model should be considered. This is our chief goal in this paper. In particular, we present tests of the null hypothesis that the beta model is correctly specified against the alternative hypothesis that it is misspecified. Alternative models should be considered for the application at hand whenever the null hypothesis of correct beta model specification is rejected. In particular, we consider a general test of correct model specification that was introduced by [11], known as 'the information matrix test', and also some variants of it. The name of the test stems from the fact that the information matrix equality is known to only hold when the model is correctly specified. Information matrix test statistics are based on the sample counterparts of the model matrices that comprise such an equality. They were derived for several statistical models and distributions, e.g., the Gaussian linear regression model (see [12]), binary data models (see [13]), linear regressions with autoregressive and moving average errors (see [14]), logistic regressions (see [15]), beta-binomial models (see [16]), and the negative binomial law (see [17]).
We obtain three information matrix test statistics for testing the null hypothesis that the beta model is correctly specified. They differ in the estimator used for the covariance matrix of a given random vector. The first two test statistics employ different estimators of the random vector's asymptotic covariance matrix whereas the third and final test statistic employs a resampling-based estimator of its exact covariance matrix. Since our numerical results show that the first two tests are considerably size-distorted in small to moderately large sample sizes, we also perform them using bootstrap critical values. It is noteworthy that the tests we develop are based on the information equality, which only holds when the model specification is not in error. As a consequence, they have power against any form of model misspecification, not only of distributional nature.
The Monte Carlo simulation evidence we report shows that the tests perform well, especially when coupled with bootstrap resampling. As noted above, three variants of the information matrix test are considered. For two of them, bootstrap resampling is used to obtain critical values that do not rely on asymptotic approximations whereas, in the remaining test, bootstrap resampling is used to estimate a covariance matrix that is used in the test statistic. Overall, the use of bootstrap resampling yields good control of the type I error frequency. Simulations in which the data were generated under the alternative hypothesis show that the tests are typically able to detect incorrect model specification, especially when the sample size is not small. Consider, e.g., the Kumaraswamy distribution, which is commonly used as an alternative law for fractional data. The numerical results we report show that when such a law is the true data-generating mechanism, the information matrix tests reject the beta model with probabilities around 0.9 for samples that contain 250 data points at the 10% significance level. Our Monte Carlo evidence also shows that the tests can successfully reject the univariate beta model when the sample size is not very small and the underlying law is beta but with non-constant means.
We model state and county Covid-19 mortality rates in the United States (US) using the univariate beta model. Three sample periods are considered: the first only includes observations from prior to the start of the nationwide vaccination campaign, the second encompasses data obtained before and after such a date, and the third and final period only includes data collected when the vaccination drive was under way. The testing inferences suggest that the beta law yields an adequate data representation for Covid-19 death rates in the first and third time periods. By contrast, the beta law is rejected when the data are heterogeneous, i.e., when the mortality rates are computed using information gathered prior to and during the nationwide vaccination drive. Interestingly, the univariate beta model is found to adequately describe the data in the third time period, in which mortality rates are negatively impacted by the reach of the vaccination drive. This happens because (i) in the initial part of the sample period vaccination was incipient and had little impact on the overall mortality figures and (ii) the negative relationship between the two variables is weakened by a few states, namely: Alaska, Arizona, Florida, Massachusetts, North Dakota, and Rhode Island. When all counties in such states are removed from the data, the inverse relationship between vaccination reach and death rates becomes considerably more intense, and the information matrix tests reject the adequacy of the univariate beta model, thus indicating that a more elaborate model should be used. The information matrix tests' inferences thus indicate that as long as the negative impact of vaccination reach on death rates is moderate, the beta law can be used to represent Covid-19 mortality rates. When such a negative impact becomes more pronounced, the univariate beta model should no longer be used.
The remainder of the paper is organized as follows. The beta distribution and the corresponding maximum likelihood parameter estimation are briefly presented in the next section. In the third section, information matrix misspecification tests for the beta model are obtained. In particular, we introduce five tests, three of which are based on bootstrap resampling. Monte Carlo simulation results are presented in the fourth section. We evaluate the tests' null (size) and non-null (power) behaviors. An empirical analysis of Covid-19 mortality rates in the US is presented and discussed in the fifth section. Finally, concluding remarks are offered in the sixth section together with directions for future research.

The beta distribution
Let Y be a beta-distributed random variable. Its density function, following the parametrization introduced by [18], can be expressed as

$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y^{\mu\phi - 1}(1 - y)^{(1-\mu)\phi - 1}, \quad 0 < y < 1,\ 0 < \mu < 1,\ \phi > 0, \tag{1}$$

where $\mathbb{E}(Y) = \mu$ and ϕ is a precision parameter since, for fixed μ, Var(Y) = μ(1 − μ)/(1 + ϕ) decreases as ϕ increases. We write $Y \sim \mathcal{B}(\mu, \phi)$. Unlike the standard beta parametrization, the parameters in (1) can be directly interpreted in terms of the distribution mean and precision. As we will see in the fifth section, it is useful to compare estimated precisions obtained from different model fits. The beta density in (1) is symmetric if μ = 0.5 and asymmetric otherwise, and it reduces to the uniform density if μ = 0.5 and ϕ = 2. The beta density can be asymmetric to the left or to the right, and it can also be J-shaped, inverted J-shaped, and U-shaped. It is thus clear, as noted by [6], that "[b]eta distributions are very versatile and a variety of uncertainties can be usefully modelled by them." It is noted that "[t]his flexibility encourages its empirical use in a wide range of applications." Let $Y_1, \ldots, Y_n$ be independent and identically distributed (i.i.d.) beta-distributed random variables and let $y_1, \ldots, y_n$ be their observed, realized values. In what follows, Y and y denote the n-vectors of such random variables and realizations, respectively. Also, $\theta = (\mu, \phi)^\top$ is the vector of beta parameters. Whenever required, we refer to μ and ϕ as $\theta_1$ and $\theta_2$, respectively.
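As a quick numerical check of the parametrization in (1), the following minimal Python sketch evaluates the density through the shape parameters μϕ and (1 − μ)ϕ of the standard beta parametrization. The function names are illustrative and are not taken from the paper.

```python
import math

def beta_logpdf(y, mu, phi):
    """Log-density (1) in the mean/precision parametrization."""
    a, b = mu * phi, (1.0 - mu) * phi  # shapes of the standard parametrization
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def beta_var(mu, phi):
    """Var(Y) = mu (1 - mu) / (1 + phi)."""
    return mu * (1.0 - mu) / (1.0 + phi)

# mu = 0.5 and phi = 2 give the uniform density, so the density equals 1.
print(math.exp(beta_logpdf(0.3, 0.5, 2.0)))
# For fixed mu, the variance falls as the precision phi grows.
print(beta_var(0.2, 20.0), beta_var(0.2, 120.0))
```

Working on the log scale with `math.lgamma` avoids overflow of the gamma function when ϕ is large, as in the high-precision scenarios considered later.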
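The log-likelihood function for Y evaluated at y is

$$\ell(\theta; y) = \sum_{t=1}^{n} \ell(\theta; y_t),$$

where $\ell(\theta; y_t) = \log f(y_t; \mu, \phi)$ is the tth individual log-likelihood, which is given by

$$\ell(\theta; y_t) = \log\Gamma(\phi) - \log\Gamma(\mu\phi) - \log\Gamma((1-\mu)\phi) + (\mu\phi - 1)\log y_t + ((1-\mu)\phi - 1)\log(1 - y_t).$$

In what follows, ψ denotes the digamma function, i.e., the first derivative of the logarithm of the gamma function.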
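For reference, differentiating $\ell(\theta; y_t) = \log f(y_t; \mu, \phi)$, with f as in (1), gives the score components below. This is a derivation from (1) rather than a quotation from the paper; the shorthand $y_t^*$ and $\mu^*$ is assumed to match the starred quantities that appear later in the text and whose definitions the paper places in its Appendix.

```latex
\begin{align*}
\frac{\partial \ell(\theta; y_t)}{\partial \mu}
  &= \phi\,(y_t^* - \mu^*),\\
\frac{\partial \ell(\theta; y_t)}{\partial \phi}
  &= \mu\,(y_t^* - \mu^*) + \log(1 - y_t) - \psi((1-\mu)\phi) + \psi(\phi),
\end{align*}
where $y_t^* = \log\{y_t/(1 - y_t)\}$ and
$\mu^* = \psi(\mu\phi) - \psi((1-\mu)\phi)$.
```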
The maximum likelihood estimator of θ, say $\hat\theta$, cannot be expressed in closed form. Parameter estimates are typically obtained by numerically maximizing the log-likelihood function using a Newton or quasi-Newton nonlinear optimization algorithm. In what follows, we use the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm with analytical first derivatives for maximum likelihood estimation; for details, see [19].
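To illustrate numerical maximum likelihood estimation for the beta model, here is a minimal standard-library Python sketch. It is an assumption-laden stand-in, not the authors' implementation: Fisher scoring is swapped in for BFGS, the digamma and trigamma functions are approximated by recurrence plus asymptotic series, and all function names are hypothetical.

```python
import math
import random

def digamma(x):
    """psi(x): shift x upward by recurrence, then use the asymptotic series."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def trigamma(x):
    """psi'(x): same recurrence-plus-series strategy."""
    r = 0.0
    while x < 6.0:
        r += 1.0 / (x * x)
        x += 1.0
    f = 1.0 / (x * x)
    return r + 1.0 / x + f / 2 + (f / x) * (1.0 / 6 - f * (1.0 / 30 - f / 42))

def score(y, mu, phi):
    """Total score vector (U_mu, U_phi) of the beta log-likelihood."""
    a, b = mu * phi, (1.0 - mu) * phi
    mu_star = digamma(a) - digamma(b)
    s1 = sum(math.log(v / (1.0 - v)) - mu_star for v in y)
    s2 = sum(math.log(1.0 - v) for v in y) + len(y) * (digamma(phi) - digamma(b))
    return phi * s1, mu * s1 + s2

def fit_beta(y, tol=1e-10, max_iter=200):
    """Fisher scoring for (mu, phi), started at moment-based values."""
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    mu, phi = ybar, ybar * (1.0 - ybar) / s2 - 1.0
    for _ in range(max_iter):
        ta, tb = trigamma(mu * phi), trigamma((1.0 - mu) * phi)
        # Expected (Fisher) information per observation.
        k11 = phi * phi * (ta + tb)
        k12 = phi * (mu * ta - (1.0 - mu) * tb)
        k22 = mu * mu * ta + (1.0 - mu) ** 2 * tb - trigamma(phi)
        u1, u2 = score(y, mu, phi)
        det = k11 * k22 - k12 * k12
        d_mu = (k22 * u1 - k12 * u2) / (n * det)
        d_phi = (k11 * u2 - k12 * u1) / (n * det)
        step = 1.0
        # Halve the step until the update stays inside the parameter space.
        while not (0.0 < mu + step * d_mu < 1.0 and phi + step * d_phi > 0.0):
            step /= 2.0
        mu, phi = mu + step * d_mu, phi + step * d_phi
        if abs(step * d_mu) + abs(step * d_phi) < tol:
            break
    return mu, phi

random.seed(1)
# Draws from B(mu = 0.2, phi = 20), i.e. shapes 4 and 16.
sample = [random.betavariate(0.2 * 20, 0.8 * 20) for _ in range(2000)]
mu_hat, phi_hat = fit_beta(sample)
print(mu_hat, phi_hat)
```

Fisher scoring is used here only because it needs nothing beyond the score and the expected information; a quasi-Newton routine such as BFGS would be the closer analogue of what the text describes.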

Beta misspecification tests
Our goal in what follows is to obtain tests of correct model specification for the beta distribution. Our focus is on the information matrix test introduced in full generality by [11]. Let $\theta_0 = (\mu_0, \phi_0)^\top$ be the true parameter value. The beta model is taken to be correctly specified if $Y_t$ follows the beta law with parameter vector $\theta_0$ for all t.
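In what follows, we will present three information matrix test statistics that can be used to test the correct beta model specification. At the outset, we derive several quantities that are used in such test statistics. We obtain, for the beta model,

$$A_n(\theta; Y) = n^{-1}\sum_{t=1}^{n} \frac{\partial^2 \ell(\theta; Y_t)}{\partial\theta\,\partial\theta^\top} = n^{-1}\sum_{t=1}^{n} \begin{pmatrix} A_n^{\mu\mu} & A_n^{\mu\phi} \\ A_n^{\phi\mu} & A_n^{\phi\phi} \end{pmatrix},$$

where $A_n^{\mu\mu} = -\phi^2 w$, $A_n^{\phi\mu} = A_n^{\mu\phi} = (Y_t^* - \mu^*) - c$ and $A_n^{\phi\phi} = -(\mu c)/\phi - (1-\mu)\psi'((1-\mu)\phi) + \psi'(\phi)$. Expressions for c and w can be found, as noted earlier, in the Appendix. Additionally,

$$B_n(\theta; Y) = n^{-1}\sum_{t=1}^{n} \nabla\ell(\theta; Y_t)\,\nabla\ell(\theta; Y_t)^\top.$$

Notice that $A_n(\theta; Y)$ and $B_n(\theta; Y)$ evaluated at $\theta = \hat\theta$ are consistent estimators of $A(\theta_0)$ and $B(\theta_0)$, respectively, where $A(\theta) = \mathbb{E}(\partial^2\ell(\theta; Y_t)/\partial\theta\,\partial\theta^\top)$ and $B(\theta) = \mathbb{E}(\nabla\ell(\theta; Y_t)\nabla\ell(\theta; Y_t)^\top)$; the information matrix equality states that $A(\theta_0) + B(\theta_0) = 0$ under correct specification. We also need to obtain $d(\theta; Y_t)$, which is a 3 × 1 vector with lth component given by

$$d_l(\theta; Y_t) = \frac{\partial \ell(\theta; Y_t)}{\partial\theta_i}\,\frac{\partial \ell(\theta; Y_t)}{\partial\theta_j} + \frac{\partial^2 \ell(\theta; Y_t)}{\partial\theta_i\,\partial\theta_j},$$

with i = j = 1 for l = 1; i = 1 and j = 2 for l = 2; i = j = 2 for l = 3. For the beta distribution, we obtain

$$d_1(\theta; Y_t) = \left(\frac{\partial \ell(\theta; Y_t)}{\partial\mu}\right)^2 + A_n^{\mu\mu}, \quad d_2(\theta; Y_t) = \frac{\partial \ell(\theta; Y_t)}{\partial\mu}\,\frac{\partial \ell(\theta; Y_t)}{\partial\phi} + A_n^{\mu\phi}, \quad d_3(\theta; Y_t) = \left(\frac{\partial \ell(\theta; Y_t)}{\partial\phi}\right)^2 + A_n^{\phi\phi}.$$

Note that $D_n(\theta; Y) = \mathrm{vech}(A_n(\theta; Y) + B_n(\theta; Y)) = n^{-1}\sum_{t=1}^{n} d(\theta; Y_t)$ is a vector that contains three elements. The information matrix test statistics we consider are functions of such a restrictions vector evaluated at $\theta = \hat\theta$. Let

$$V(\theta) = \mathbb{E}\left[\left(d(\theta; Y_t) - \nabla D(\theta)A(\theta)^{-1}\nabla\ell(\theta; Y_t)\right)\left(d(\theta; Y_t) - \nabla D(\theta)A(\theta)^{-1}\nabla\ell(\theta; Y_t)\right)^\top\right],$$

where $D(\theta) = \mathbb{E}(d(\theta; Y_t))$ and $\nabla D(\theta) = \partial D(\theta)/\partial\theta^\top$. [11] showed that, under correct model specification, $\sqrt{n}\,D_n(\hat\theta; Y)$ is asymptotically normally distributed with zero mean and covariance matrix $V(\theta_0)$ and noticed that a natural consistent estimator for $V(\theta_0)$ is

$$V_{n1}(\theta) = n^{-1}\sum_{t=1}^{n}\left[\left(d(\theta; Y_t) - \nabla D_n(\theta; Y)A_n(\theta; Y)^{-1}\nabla\ell(\theta; Y_t)\right)\left(d(\theta; Y_t) - \nabla D_n(\theta; Y)A_n(\theta; Y)^{-1}\nabla\ell(\theta; Y_t)\right)^\top\right]$$

evaluated at $\theta = \hat\theta$, where $\nabla D_n(\theta; Y) = \partial D_n(\theta; Y)/\partial\theta^\top$. Closed-form expressions for the elements of $\nabla D_n(\theta; Y)$ in the beta model are given in the Appendix. The first information matrix test statistic is

$$z_1 = n\,D_n(\hat\theta)^\top \left(V_{n1}(\hat\theta)\right)^{-1} D_n(\hat\theta),$$

where the quadratic form uses the q components of $D_n(\theta; Y)$ considered (q ≤ 3). Under $H_0$, $z_1$ is asymptotically distributed as $\chi^2_q$. The test is then carried out using critical values from such a distribution, i.e., $H_0$ is rejected at significance level α ∈ (0, 1) if $z_1 > \chi^2_{q,1-\alpha}$, where $\chi^2_{q,1-\alpha}$ is the 1 − α quantile of $\chi^2_q$. Alternative information matrix test statistics can be obtained by considering different consistent estimators for $V(\theta_0)$.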
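The paper defers the expressions for c and w to its Appendix. As a worked check derived here from (1), not quoted from the paper, differentiating the beta log-likelihood twice reproduces the three Hessian elements given in the text when c and w are taken to be the following combinations of the trigamma function ψ′:

```latex
\begin{align*}
w &= \psi'(\mu\phi) + \psi'((1-\mu)\phi),\\
c &= \phi\,\{\mu\,\psi'(\mu\phi) - (1-\mu)\,\psi'((1-\mu)\phi)\},
\end{align*}
so that, for instance,
\[
\frac{\partial^2 \ell(\theta; Y_t)}{\partial\phi^2}
  = -\mu^2\,\psi'(\mu\phi) - (1-\mu)^2\,\psi'((1-\mu)\phi) + \psi'(\phi)
  = -\frac{\mu c}{\phi} - (1-\mu)\,\psi'((1-\mu)\phi) + \psi'(\phi).
\]
```

Only the cross derivative depends on the data, through $Y_t^* - \mu^*$; the two diagonal elements are non-random.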
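[20,21] showed that it is possible to use a covariance matrix estimator that does not require third order log-likelihood derivatives. They use the fact that, under $H_0$, $\nabla D(\theta_0) = -\mathbb{E}(d(\theta_0; Y_t)\,\nabla\ell(\theta_0; Y_t)^\top)$; see [21]. Let

$$L_n(\theta; Y) = -n^{-1}\sum_{t=1}^{n} d(\theta; Y_t)\,\nabla\ell(\theta; Y_t)^\top.$$

The Chesher-Lancaster estimator of $V(\theta_0)$ is

$$V_{n2}(\theta) = n^{-1}\sum_{t=1}^{n}\left[\left(d(\theta; Y_t) + L_n(\theta; Y)B_n(\theta; Y)^{-1}\nabla\ell(\theta; Y_t)\right)\left(d(\theta; Y_t) + L_n(\theta; Y)B_n(\theta; Y)^{-1}\nabla\ell(\theta; Y_t)\right)^\top\right]$$

evaluated at $\theta = \hat\theta$. The corresponding information matrix test statistic is

$$z_2 = n\,D_n(\hat\theta)^\top \left(V_{n2}(\hat\theta)\right)^{-1} D_n(\hat\theta).$$

Under $H_0$, $z_2$ is asymptotically distributed as $\chi^2_q$ and, as before, the test is carried out using asymptotic critical values.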
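The Chesher-Lancaster form needs only scores and second derivatives, so it can be sketched compactly. The Python sketch below computes a $z_2$-type statistic with q = 2 for a beta sample. It rests on several assumptions: Fisher scoring stands in for BFGS, the special functions are series approximations, $L_n$ is taken as minus the average of $d\,\nabla\ell^\top$ (the sign that agrees with the stated identity for $\nabla D(\theta_0)$), and every function name is hypothetical.

```python
import math
import random

def digamma(x):
    """psi(x) via upward recurrence and an asymptotic series."""
    r = 0.0
    while x < 6.0:
        r -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return r + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def trigamma(x):
    """psi'(x) via upward recurrence and an asymptotic series."""
    r = 0.0
    while x < 6.0:
        r += 1.0 / (x * x)
        x += 1.0
    f = 1.0 / (x * x)
    return r + 1.0 / x + f / 2 + (f / x) * (1.0 / 6 - f * (1.0 / 30 - f / 42))

def scores(y, mu, phi):
    """Per-observation scores (d l / d mu, d l / d phi)."""
    da, db, dp = digamma(mu * phi), digamma((1.0 - mu) * phi), digamma(phi)
    out = []
    for v in y:
        e = math.log(v / (1.0 - v)) - (da - db)  # y*_t - mu*
        out.append((phi * e, mu * e + math.log(1.0 - v) - db + dp))
    return out

def fit_beta(y, iters=50):
    """Fisher scoring from moment-based starting values (stand-in for BFGS)."""
    n = len(y)
    ybar = sum(y) / n
    mu = ybar
    phi = ybar * (1.0 - ybar) / (sum((v - ybar) ** 2 for v in y) / (n - 1)) - 1.0
    for _ in range(iters):
        ta, tb = trigamma(mu * phi), trigamma((1.0 - mu) * phi)
        k11, k12 = phi * phi * (ta + tb), phi * (mu * ta - (1.0 - mu) * tb)
        k22 = mu * mu * ta + (1.0 - mu) ** 2 * tb - trigamma(phi)
        rows = scores(y, mu, phi)
        u1, u2 = sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n
        det = k11 * k22 - k12 * k12
        mu, phi = mu + (k22 * u1 - k12 * u2) / det, phi + (k11 * u2 - k12 * u1) / det
    return mu, phi

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def z2_statistic(y, mu, phi):
    """Chesher-Lancaster information matrix statistic using d_1 and d_2 only."""
    n = len(y)
    ta, tb = trigamma(mu * phi), trigamma((1.0 - mu) * phi)
    w, c = ta + tb, phi * (mu * ta - (1.0 - mu) * tb)
    S = scores(y, mu, phi)
    # Indicators: score products plus the matching second derivatives.
    D = [(s[0] ** 2 - phi * phi * w, s[0] * s[1] + s[0] / phi - c) for s in S]

    def avg(X, Z):
        return [[sum(x[i] * z[j] for x, z in zip(X, Z)) / n for j in range(2)]
                for i in range(2)]

    Dn = [sum(d[i] for d in D) / n for i in range(2)]
    Bi = inv2(avg(S, S))
    Ln = [[-a for a in row] for row in avg(D, S)]
    M = [[sum(Ln[i][k] * Bi[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]  # L_n B_n^{-1}
    R = [(d[0] + M[0][0] * s[0] + M[0][1] * s[1],
          d[1] + M[1][0] * s[0] + M[1][1] * s[1]) for d, s in zip(D, S)]
    Vi = inv2(avg(R, R))
    return n * sum(Dn[i] * Vi[i][j] * Dn[j] for i in range(2) for j in range(2))

random.seed(11)
y = [random.betavariate(4.0, 16.0) for _ in range(500)]  # B(mu = 0.2, phi = 20)
mu_hat, phi_hat = fit_beta(y)
z2 = z2_statistic(y, mu_hat, phi_hat)
print(z2)
```

The vectors in `R` are the residuals from projecting the indicators on the scores, which is why no third-order derivatives are needed.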
It is noteworthy that $V_{n1}(\hat\theta)$ and $V_{n2}(\hat\theta)$ are consistent estimators of $V(\theta_0)$, the latter being the asymptotic covariance matrix of $\sqrt{n}\,D_n(\hat\theta; Y)$. A consistent estimator of the exact covariance matrix of such a vector, say $V_n^S(\theta_0)$, can be obtained by using parametric bootstrap resampling, as shown by [22]. The bootstrap estimator of $V_n^S(\theta_0)$ based on B bootstrap samples can be computed as follows:

1. Using the original sample $Y = (Y_1, \ldots, Y_n)^\top$, compute $\hat\theta$.

Using the bootstrap replicates
For fixed n and as B → ∞, it follows that $\hat V^*_{n3,B} \overset{p}{\to} V_n^S(\hat\theta)$; see [22]. We thus arrive at a third information matrix test statistic for testing the correct beta model specification. It is given by

$$z_3 = n\,D_n(\hat\theta)^\top \left(\hat V^*_{n3,B}\right)^{-1} D_n(\hat\theta).$$

Under $H_0$, for fixed B and n → ∞, $z_3$ is asymptotically distributed as $T^2_{q,B-1}$, i.e., as Hotelling's T-squared distribution with q and B − 1 degrees of freedom; see [22]. As before, the test is performed using asymptotic critical values.
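The idea behind a bootstrap covariance estimate can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the estimator of [22] as implemented by the authors: `draw_sample` and `statistic` are hypothetical callables, and in the setting above they would draw a sample from the fitted beta law and return the restrictions vector recomputed on that sample.

```python
import random

def bootstrap_covariance(draw_sample, statistic, B):
    """Sample covariance matrix of a vector statistic over B bootstrap samples."""
    reps = [statistic(draw_sample()) for _ in range(B)]
    q = len(reps[0])
    means = [sum(r[i] for r in reps) / B for i in range(q)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in reps) / (B - 1)
             for j in range(q)] for i in range(q)]

# Toy usage: samples of size n from B(mu = 0.2, phi = 20) and a
# two-component statistic (first and second sample moments).
random.seed(7)
n = 100

def draw():
    return [random.betavariate(4.0, 16.0) for _ in range(n)]

def stat(y):
    return (sum(y) / n, sum(v * v for v in y) / n)

cov = bootstrap_covariance(draw, stat, B=500)
print(cov)
```

The divisor B − 1 is the usual unbiased choice and is in line with the B − 1 degrees of freedom of the Hotelling reference distribution mentioned in the text.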
The information matrix test statistics $z_1$, $z_2$ and $z_3$ measure the sample evidence against the correct beta model specification. When they assume large values and $H_0$ is rejected at the usual significance levels, an alternative model should be used. A word of caution, however, is in order. The test based on $z_3$ is expected to perform well in small to moderately large samples since the test statistic uses a bootstrap estimator of the exact covariance matrix of $\sqrt{n}\,D_n(\hat\theta; Y)$. The tests based on $z_1$ and $z_2$, by contrast, may be considerably size-distorted when n is not large since the test statistics use estimators of the asymptotic covariance matrix of $\sqrt{n}\,D_n(\hat\theta; Y)$ and such an asymptotic covariance matrix may be a poor approximation for its exact counterpart when n is not large. To remedy that, we recommend that $z_1$ and $z_2$ testing inferences be based on critical values obtained from bootstrap resampling instead of on $\chi^2_{q,1-\alpha}$ (asymptotic critical values). To that end, for i = 1, 2:

1. Using the original sample $Y = (Y_1, \ldots, Y_n)^\top$, compute $\hat\theta$ and $z_i$.

Obtain a random sample of size
The use of bootstrap resampling when performing testing inferences based on the information matrix test statistics $z_1$ and $z_2$ may considerably reduce size distortions since the critical values used in such tests are now obtained from estimates of the test statistics' exact null distributions.
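Replacing an asymptotic critical value with a bootstrap one can be sketched generically as below. The sketch is illustrative only: `draw_sample` and `statistic` are hypothetical callables (a draw from the fitted null model and the test statistic recomputed on it), and taking the ⌈(1 − α)B⌉th order statistic as the quantile is one common convention assumed here, not one stated in the text.

```python
import math
import random

def bootstrap_critical_value(draw_sample, statistic, B, alpha):
    """Empirical (1 - alpha) quantile of the statistic over B bootstrap samples."""
    reps = sorted(statistic(draw_sample()) for _ in range(B))
    k = math.ceil((1.0 - alpha) * B) - 1  # zero-based index of the order statistic
    return reps[min(max(k, 0), B - 1)]

def bootstrap_test(observed, draw_sample, statistic, B, alpha):
    """Reject the null when the observed statistic exceeds the bootstrap quantile."""
    return observed > bootstrap_critical_value(draw_sample, statistic, B, alpha)

# Toy usage with a scalar statistic: squared standardized distance of the
# sample mean from 0.2 under B(mu = 0.2, phi = 20).
n = 100

def draw():
    return [random.betavariate(4.0, 16.0) for _ in range(n)]

def stat(y):
    return n * (sum(y) / n - 0.2) ** 2 / (0.2 * 0.8 / 21.0)

random.seed(3)
crit10 = bootstrap_critical_value(draw, stat, B=200, alpha=0.10)
random.seed(3)
crit05 = bootstrap_critical_value(draw, stat, B=200, alpha=0.05)
print(crit10, crit05)
```

Because the reference distribution is simulated at the estimated parameter values, the resulting critical value adapts to the sample size at hand instead of relying on the limiting chi-squared law.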
As noted earlier, it is possible to test q ≤ 3 restrictions. In what follows, we will test two restrictions since numerical evaluations not shown here for brevity revealed that the third element of $D_n(\hat\theta; Y)$ always assumes very small values and has very small variance, especially when dispersion is low, which yields nearly singular estimates of $V(\theta_0)$. As noted by [11], when an indicator is identically null it should be ignored; see the example on page 10 of his article. Unlike what happens in his example, the maximum likelihood estimators in our case cannot be expressed in closed form, and that is why we had to resort to numerical evaluations to determine whether there is a non-relevant restriction. We thus test q = 2 restrictions by using $d(\theta; Y_t) = (d_1(\theta; Y_t), d_2(\theta; Y_t))^\top$. Correspondingly, we drop the last row of $\nabla D_n(\theta; Y)$. The asymptotic null distribution of $z_1$ and $z_2$ is $\chi^2_2$, and that of $z_3$ is $T^2_{2,B-1}$, where B is the number of bootstrap replications used in the estimation of $V_n^S(\theta_0)$.
According to [11], it is expected that the tests will be consistent (i.e., have unit power asymptotically) against any alternative which renders the usual maximum likelihood inference techniques invalid. In our case, maximum likelihood inference involves the estimation of the beta distribution mean and precision parameters. When Y follows other laws or when the values of the beta parameters are not the same for all observations, the test statistics are expected to diverge in probability so that unit power is achieved asymptotically. We performed Monte Carlo simulations using a number of alternative models as the true data generating mechanism, which include alternative laws, data inflation, and neglected regression structure. The results from these simulations are presented in the next section. They show evidence of asymptotic unit power under all sources of model misspecification we considered.

Numerical evidence
We will now numerically evaluate the performance of the information matrix tests when used to determine whether the beta distribution yields a satisfactory data fit, i.e., when used to determine whether the beta model is correctly specified. Data generation is carried out under the null and alternative hypotheses (correct and incorrect model specification, respectively). Beta random number generation is performed using the acceptance-rejection method based on uniform random draws obtained using the Mersenne Twister method. Parameter estimates are obtained by numerically maximizing the beta log-likelihood function using the BFGS quasi-Newton algorithm with analytical first derivatives. The starting values used in the estimation of μ and ϕ are, respectively, $\bar y$ and $\bar y(1 - \bar y)/\widehat{\mathrm{Var}}(Y) - 1$, where $\bar y = n^{-1}\sum_{t=1}^{n} y_t$ and $\widehat{\mathrm{Var}}(Y) = (n-1)^{-1}\sum_{t=1}^{n}(y_t - \bar y)^2$. The numbers of Monte Carlo and bootstrap replications are, respectively, 5000 and 500. The null hypothesis is $H_0$: "the beta model is correctly specified" and the alternative hypothesis is $H_1$: "the beta model is misspecified". The following tests are performed: $z_1$, $z_{1B}$, $z_2$, $z_{2B}$, and $z_3$. The $z_{1B}$ and $z_{2B}$ tests employ bootstrap critical values, and the $z_3$ test statistic uses a bootstrap covariance matrix estimate. The simulations were performed using the R statistical computing environment; see [23].
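The starting values follow from solving Var(Y) = μ(1 − μ)/(1 + ϕ) for ϕ and plugging in the sample mean and variance. A minimal Python sketch, with a hypothetical function name:

```python
def moment_start(y):
    """Starting values (mu0, phi0) from the sample mean and variance."""
    n = len(y)
    ybar = sum(y) / n
    var = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return ybar, ybar * (1.0 - ybar) / var - 1.0

# Toy data: mean 0.2 and sample variance 0.01, so phi0 = 0.16 / 0.01 - 1.
mu0, phi0 = moment_start([0.1, 0.2, 0.3])
print(mu0, phi0)
```

The precision starting value is positive whenever the sample variance is below $\bar y(1 - \bar y)$, which always holds for data in (0, 1) with at least moderately many observations.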
In what follows, we will report the tests' null and non-null rejection rates obtained from size (data generated under $H_0$) and power (data generated under $H_1$) simulations, respectively. Additionally, we will present p-value plots and size-power plots for the $z_1$, $z_2$ and $z_3$ tests, i.e., for the tests that do not employ bootstrap critical values. Based on the size simulations (the data-generating process is beta), we plot the tests' empirical sizes (vertical axis) against nominal sizes, i.e., against values of α ∈ (0, 1) (horizontal axis). The 45° line indicates perfect agreement between actual and nominal sizes. Curves that lie above (below) such a diagonal line for a given range of values of α are indicative of liberal (conservative) behavior at those significance levels. It should be noted that, in this graphical analysis, α is not fixed at three values (0.10, 0.05 and 0.01) but varies from close to zero up to close to one. We thus obtain a comprehensive view of the tests' null behaviors. We also present plots that relate the tests' empirical powers (vertical axis) to the corresponding sizes (horizontal axis), computed for values of α ranging from close to zero up to close to one. The non-null rejection rates are computed using a data-generating process that differs from the beta law. It should be noted that since the non-null rejection rates are plotted using the empirical critical value for each nominal size (and not using asymptotic critical values) it is possible to compare the tests' non-null behaviors by properly accounting for any existing size distortions. The higher the curve, the more powerful the test. For more details on these plots, see [24].
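The coordinates of both kinds of plots can be obtained from simulated p-values alone. The following Python sketch assumes that each Monte Carlo replication yields one p-value per test; the function and variable names are hypothetical.

```python
def rejection_curve(p_values, alphas):
    """Share of p-values not exceeding each nominal level alpha."""
    n = len(p_values)
    return [sum(p <= a for p in p_values) / n for a in alphas]

def size_power_points(null_p, alt_p, alphas):
    """Pairs (empirical size, empirical power), one pair per nominal level."""
    return list(zip(rejection_curve(null_p, alphas),
                    rejection_curve(alt_p, alphas)))

alphas = [0.1, 0.5]
# p-value plot: nominal levels on the horizontal axis, these rates on the vertical.
print(rejection_curve([0.01, 0.2, 0.5, 0.9], alphas))
# Size-power plot: first coordinate horizontal, second coordinate vertical.
print(size_power_points([0.01, 0.2, 0.5, 0.9], [0.01, 0.02, 0.3, 0.6], alphas))
```

Pairing the two rejection curves at the same nominal level is what makes the comparison size-adjusted: a liberal test is moved to the right on the horizontal axis rather than being credited with extra power.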
In the first size simulation, the data are generated from the beta law with μ = 0.2 and ϕ = 20, 40, 80, 120. The null rejection rates of the $z_1$, $z_{1B}$, $z_2$, $z_{2B}$ and $z_3$ tests are shown in Table 1. All entries are percentages. The reported results lead to interesting conclusions. First, the $z_1$ and $z_2$ tests, which use asymptotic critical values, are quite liberal when the sample size is not very large; even with n = 1000, considerable size distortions take place. Second, such tests have effective sizes that are close to the nominal sizes when bootstrap (rather than asymptotic) critical values are used. For example, when ϕ = 40 and n = 100, the sizes of $z_1$ and $z_2$, at α = 10%, are 17.6% and 37.4%; when bootstrap critical values are used, these rates drop to 10.7% and 9.8%, respectively. The use of bootstrap resampling thus considerably reduces size distortions. Third, the size distortions of $z_1$ decrease when the value of ϕ increases. For example, the test's null rejection rates for n = 100 and α = 10% are 20.1% and 12.3% when ϕ = 20 and ϕ = 120, respectively. It is worth noticing that the variance of Y decreases when the value of ϕ increases, and that translates into more accurate testing inferences. Fourth, the $z_3$ test tends to be conservative when n ≤ 1000, and displays null rejection rates close to the nominal levels with n = 5000.
In the second set of size simulations, data generation was performed from the beta distribution with μ = 0.5 and the same precision values as before. The tests' null rejection rates are presented in Table 2. All entries are percentages. In general, the new results are similar to those in the previous scenario. The $z_1$ and $z_2$ tests remain liberal, with $z_1$ exhibiting considerably higher null rejection rates relative to previous results. For example, when ϕ = 40, α = 10% and n = 100, the null rejection rate of $z_1$ is 28.4% whereas in the previous scenario it was 17.6%. The testing inferences are less accurate here because there exists more uncertainty since the variance of the beta distribution is maximal when μ = 0.5; recall that such a variance is μ(1 − μ)/(1 + ϕ). The figures in Table 2 further show that the $z_{1B}$ and $z_{2B}$ tests display the smallest size distortions, being accurate even when n is small. For example, when ϕ = 20 and n = 50, the sizes of $z_{1B}$ and $z_{2B}$, at α = 10%, are 10.0% and 9.6%, respectively. It is thus clear that bootstrap resampling works remarkably well. Additionally, the $z_3$ test remains conservative when μ = 0.5, but only for α = 10% and 5%. The test exhibits small size distortions when n ≥ 250. For instance, when ϕ = 20 and n = 250, the test's null rejection rate, at α = 10%, is 9.3%.
The third and final set of size simulations was performed using μ = 0.75 with the same precision values as before. We used μ = 0.75 (and not μ = 0.8) to avoid symmetry relative to the first scenario. The null rejection rates, expressed as percentages, are presented in Table 3. Overall, the results in this scenario are similar to those in Table 1 (μ = 0.2). The $z_1$ and $z_2$ tests are liberal when n ≤ 1000 and only become accurate with n = 5000. The $z_{1B}$ and $z_{2B}$ tests have the smallest size distortions. Such tests deliver accurate inferences even when n is small. For example, when n = 50, ϕ = 40 and α = 10%, their null rejection rates are 10.1% and 9.7%, respectively. It should also be noted that the $z_3$ test exhibits conservative behavior when n ≤ 500. For example, with n = 500, ϕ = 20 and α = 10%, its null rejection rate is 8.7%.
The results presented above show that, in general, the $z_1$ test exhibits less liberal behavior when the mean of the distribution is not in the middle of the standard unit interval. For instance, when ϕ = 120, n = 250 and α = 10%, the test's null rejection rates for μ = 0.2, 0.5, 0.75 are 12.7%, 22.1% and 14.5%, respectively. Recall that the beta density is symmetric if μ = 0.5 and asymmetric otherwise. It seems that the $z_1$ test incorrectly finds increasing evidence against the beta model as the distribution becomes more symmetric. The results also show that the $z_2$ test is quite liberal in all scenarios, especially when the sample size is small. Finally, the $z_3$ test becomes more conservative as the distribution mean moves away from 0.5. For example, when ϕ = 80, n = 500 and α = 10%, the test's null rejection rates for μ = 0.2, 0.5, 0.75 are 7.0%, 9.5% and 7.2%, respectively.

Table 1. Null rejection rates (%), μ = 0.2.

PLOS ONE
Beta misspecification tests and Covid-19 mortality rates in the U.S.

Fig 1 contains p-value plots for the $z_1$, $z_2$ and $z_3$ tests corresponding to different values of μ. The sample sizes are n = 100, 250 and ϕ = 120. The three curves move closer to the diagonal line when the sample increases from n = 100 to n = 250, thus indicating that the tests' size distortions for all nominal sizes decrease as n increases. It is also clear that $z_1$ and $z_2$ are liberal and $z_3$ is conservative regardless of the value of α, $z_1$ being less size-distorted than $z_2$, especially when the underlying beta law is asymmetric (μ ≠ 0.5). Interestingly, for all values of α, under distributional asymmetry (symmetry), $z_1$ ($z_3$) is the most accurate test.
We will now shift the focus to the tests' powers, i.e., to their ability to correctly identify that the null hypothesis is false. In these simulations, the true data-generating process is not the standard beta law, i.e., it is not the beta distribution with constant parameters. Since the $z_1$ and $z_2$ tests are oftentimes considerably size-distorted, they are carried out using exact (not asymptotic) critical values obtained from the size simulations. The significance levels are α = 10%, 5%.
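Power comparisons based on critical values taken from the size simulations can be sketched as follows. The Python code assumes that the simulated test statistics under the null and under the alternative are available as plain lists; the names and the order-statistic quantile convention are illustrative assumptions.

```python
import math

def empirical_critical_value(null_stats, alpha):
    """(1 - alpha) quantile of the statistics simulated under the null."""
    s = sorted(null_stats)
    k = math.ceil((1.0 - alpha) * len(s)) - 1  # zero-based order-statistic index
    return s[min(max(k, 0), len(s) - 1)]

def size_corrected_power(alt_stats, null_stats, alpha):
    """Share of alternative-hypothesis statistics above the empirical critical value."""
    crit = empirical_critical_value(null_stats, alpha)
    return sum(z > crit for z in alt_stats) / len(alt_stats)

# Toy numbers: null statistics 1, ..., 100 and four statistics under H1.
null_stats = list(range(1, 101))
alt_stats = [50, 95, 100, 120]
print(size_corrected_power(alt_stats, null_stats, 0.10))
```

With critical values chosen this way, every test has the same rejection rate under the null by construction, so differences in the non-null rates reflect power rather than size distortion.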
At the outset, we use the Kumaraswamy law (see [7]), KW(ω, ϕ), as the true data-generating mechanism. Here, ω is the distribution median and ϕ is a precision parameter. The parameter values are (i) ω = 0.2 and ϕ = 5, 7.5, (ii) ω = 0.5 and ϕ = 10, 15, and (iii) ω = 0.75 and ϕ = 15, 25. The tests' non-null rejection rates are presented in Table 4. All entries are percentages. The figures in this table show that the tests' powers are similar for n ≥ 100, being close to 100% when n ≥ 250. When n = 50, the $z_3$ test is generally the most powerful test. The reported results also show that the tests' powers increase with ϕ. That is, higher precision translates into more powerful tests. Also, when ω = 0.2, the $z_3$ test exhibits slightly higher powers than the $z_1$ and $z_{1B}$ tests, and these in turn exhibit noticeably higher powers than $z_2$ and $z_{2B}$. For illustration, with ω = 0.2, ϕ = 7.5, n = 100 and α = 5%, the non-null rejection rates of the $z_1$, $z_{1B}$, $z_2$, $z_{2B}$ and $z_3$ tests are 61.3%, 65.2%, 56.7%, 56.8% and 69.7%, respectively. Here, $z_3$ is the best performer. It is also noteworthy that $z_3$ is the most powerful test when ω = 0.5 for all values of ϕ and α. Additionally, it is seen that the $z_2$ and $z_{2B}$ tests are more powerful than the $z_1$ and $z_{1B}$ tests. Finally, when ω = 0.75, for all values of α and ϕ, the $z_1$, $z_{1B}$ and $z_3$ tests display similar powers, which are considerably higher than those of $z_2$ and $z_{2B}$. Fig 2 contains size-power plots for $z_1$, $z_2$ and $z_3$. The sample size is n = 100 and the empirical powers were computed using KW(ω, ϕ) data-generating processes. The tests' powers are very similar for empirical sizes in excess of 0.4. For empirical sizes up to 0.4, $z_3$ is the clear winner, especially in the left and middle panels; in the right panel, the curves relative to $z_1$ and $z_3$ nearly coincide, both clearly lying above that of $z_2$. Also, $z_1$ is the worst performer when the distribution median lies at the center of the standard unit interval, i.e., ω = 0.5; see panel (b).
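Kumaraswamy draws are easy to generate by inverting the distribution function. The Python sketch below uses the standard shape parametrization, $F(y) = 1 - (1 - y^a)^b$, and solves for b so that the median equals ω. How the paper's precision parameter ϕ maps to these shapes is not stated in this section, so the shape argument `a` is a labelled assumption, and the function names are hypothetical.

```python
import math
import random

def kw_b_from_median(omega, a):
    """Second shape b such that the Kumaraswamy(a, b) median equals omega."""
    return math.log(0.5) / math.log(1.0 - omega ** a)

def rkumaraswamy(n, omega, a, rng=random):
    """Draws by inversion of F(y) = 1 - (1 - y^a)^b."""
    b = kw_b_from_median(omega, a)
    return [(1.0 - (1.0 - rng.random()) ** (1.0 / b)) ** (1.0 / a)
            for _ in range(n)]

random.seed(2)
draws = sorted(rkumaraswamy(20001, omega=0.2, a=5.0))  # a = 5.0 is illustrative
print(draws[10000])  # sample median, expected to be near 0.2
```

Inversion is convenient here because the Kumaraswamy distribution function has a closed-form inverse, unlike the beta distribution function.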
In the second scenario of power simulations, all samples are randomly generated from the unit Weibull law (see [9]), UW(ω, ϕ), where ω is the distribution median and ϕ is a precision parameter. For brevity, we only report results obtained using ω = 0.2, 0.5, 0.75 and ϕ = 5. The tests' non-null rejection rates are given in Table 5. All entries are percentages. It is worth noticing that all empirical powers are nearly equal to 100% when n = 250. When ω = 0.2, $z_1$ is the best performer. The $z_3$ test is slightly more powerful than the other tests when ω = 0.5 and 0.75.
Size-power plots are presented in Fig 3. The sample size is n = 100 and the tests' empirical powers were computed using unit Weibull data-generating mechanisms. In Fig 3 panel (a), the size-power curves of the $z_1$ and $z_3$ tests are clearly above that of the $z_2$ test for empirical sizes up to approximately 40%. In panel (b) of Fig 3, for empirical sizes up to about 50%, the curve of the $z_3$ test is above the curve of the $z_2$ test, which in turn is above that of the $z_1$ test. Finally, panel (c) of Fig 3 clearly favors $z_3$ for empirical sizes up to approximately 50%.
The next set of power simulation results was obtained using simplex (see [8]) data-generating mechanisms: all samples are randomly generated from S(μ, σ), where μ is the distribution mean and σ is the dispersion parameter. For brevity, we only present results for μ = 0.75 and σ = 2. The tests' non-null rejection rates, expressed as percentages, can be found in Table 6. It is noteworthy that the powers of the $z_1$, $z_{1B}$, $z_2$ and $z_{2B}$ tests are quite high for n ≥ 250. Also, the $z_3$ test is clearly less powerful than the competing tests. For example, when n = 100 and α = 10%, the powers of the $z_1$, $z_{1B}$, $z_2$ and $z_{2B}$ tests exceed 60% whereas that of the $z_3$ test is approximately equal to 22%.
We present size-power plots constructed using the tests' empirical powers under simplex laws in Fig 4. The sample size is n = 100. It can be seen that the curves relative to the $z_1$ and $z_2$ tests are similar. They both lie considerably above that of the $z_3$ test for effective sizes up to 40%.
Next, we consider the case in which the data are generated from the beta law but with a regression structure for the mean. That is, we use the beta regression model introduced by [18] as the true model. Here, $\log(\mu_t/(1 - \mu_t)) = \beta_1 + \beta_2 x_{t2}$. The true parameter values are $\beta_1 = -0.25$, $\beta_2 = 0.5$ and ϕ = 120. The covariate values were generated from LN(0, 0.5), i.e., as realizations from the log-normal distribution with parameters 0 and 0.5. Table 7 contains the tests' non-null rejection rates, all expressed as percentages. In general, all tests have high powers when the sample size is not very small. In particular, for n = 250 and α = 10%, the tests have powers close to or equal to 100%. When n = 50, $z_1$ is clearly less powerful than $z_2$ and $z_3$.
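This data-generating process can be sketched in a few lines of Python. One point is an assumption: the text does not say whether the second log-normal parameter, 0.5, is a standard deviation or a variance on the log scale; the sketch treats it as the log-scale standard deviation. The function name is hypothetical.

```python
import math
import random

def simulate_beta_regression(n, beta1=-0.25, beta2=0.5, phi=120.0, rng=random):
    """Beta responses whose means follow a logit-linear regression."""
    # Assumption: 0.5 is the standard deviation of log(x).
    x = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
    mu = [1.0 / (1.0 + math.exp(-(beta1 + beta2 * xi))) for xi in x]  # inverse logit
    y = [rng.betavariate(m * phi, (1.0 - m) * phi) for m in mu]
    return x, mu, y

random.seed(4)
x, mu, y = simulate_beta_regression(250)
print(min(mu), max(mu))
```

Because the covariate is positive and $\beta_2 > 0$, every mean exceeds the inverse logit of −0.25 (about 0.44), so a constant-mean beta fit neglects systematic variation in the means.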
In Fig 5, we present the size-power plot of z 1 , z 2 and z 3 for n = 50. In general, the tests have similar powers when the effective size is smaller than 20% or larger than 60%. In the middle region of the graph, z 3 is the most powerful test.
Table 5. Non-null rejection rates (%), data generated from UW(ω, ϕ).
We also performed simulations using the inflated beta distribution introduced by [25] as the true model. It combines continuous and discrete components, and is used when Y assumes values in [0, 1), (0, 1] or [0, 1] (inflation at zero, inflation at one, and double inflation, respectively). A common practice is to fit the standard beta distribution after replacing the inflated data points by [Y t (n−1) + 0.5]/n; see [26]. We consider inflation at zero with Pr(Y = 0) = λ.
After the data were generated, all inflated values (zeros) were replaced by 0.5/n, and then the standard beta law was fitted. The null hypothesis is false since the beta model is not the true data-generating process. We wish to evaluate the information matrix tests' ability to detect that the beta model is misspecified. Data generation was carried out using μ = 0.5, ϕ = 20 and λ = 0.025. We will not present the simulation results for brevity, but we note that the information matrix tests proved to be very powerful in this setting with non-null rejection rates close to 100% at α = 5% for n = 100.
Overall, the results presented above favor the z 1B , z 2B and z 3 tests. The z 1 and z 2 tests typically display very large size distortions and their use should be avoided except when n is large. Regarding the z 1B , z 2B and z 3 tests, we note that the latter may be considerably conservative for some beta law parameter values. As a result, we recommend the use of the z 1B and z 2B tests in empirical analyses. Such tests showed good control of the type I error frequency and also good power in situations in which the data-generating process is not beta, in particular when n ≥ 250.
It is also possible to test the null hypothesis that the variable of interest is beta-distributed using two alternative tests, namely: Anderson-Darling (AD) and Cramér-von Mises (CVM). They are usually carried out with the modification proposed by [27], which accounts for unknown parameters in the distribution under test (in our case, beta). We performed Monte Carlo simulations to assess the finite sample behaviors of such tests using the configurations previously described. We do not present such results for brevity. We note, however, that both tests are conservative, i.e., their null rejection rates are smaller than the significance levels. For instance, when n = 100 (n = 500) the AD and CVM null rejection rates at the 10% significance level are, respectively, 7.6% and 6.3% (9.2% and 8.5%).
Also, such non-parametric tests are substantially less powerful than the information matrix tests introduced in this paper. For instance, when the true data-generating process is UW(0.5, 7.5) (KW(0.5, 15)), the AD and CVM non-null rejection rates at α = 10% are, respectively, 29.1% and 42.0% (28.9% and 39.3%) when n = 5000.
Table 7. Non-null rejection rates (%), data generated from the beta distribution with a mean regression structure.
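The zero-inflation experiment described earlier in this section can be sketched as follows, using μ = 0.5, ϕ = 20 and λ = 0.025 as in the text. The function names are hypothetical, and the sketch covers only data generation and the replacement of zeros by 0.5/n, not the tests themselves.

```python
import numpy as np

def simulate_zero_inflated_beta(n, mu=0.5, phi=20.0, lam=0.025, seed=None):
    """Beta draws in which each observation is set to zero with probability lam."""
    rng = np.random.default_rng(seed)
    y = rng.beta(mu * phi, (1.0 - mu) * phi, size=n)
    zeros = rng.uniform(size=n) < lam
    y[zeros] = 0.0
    return y

def replace_zeros(y):
    """Replace exact zeros by 0.5/n so that the standard beta law can be fitted."""
    y = np.asarray(y, dtype=float).copy()
    n = y.size
    y[y == 0.0] = 0.5 / n
    return y

y_raw = simulate_zero_inflated_beta(100, seed=3)
y_adj = replace_zeros(y_raw)
```

The adjusted sample lies in (0, 1), but the point mass moved to 0.5/n is not something a beta density can reproduce, which is why the null hypothesis is false in this design.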

Covid-19 mortality rates in the US
We will now present and discuss an analysis of Covid-19 mortality rates in the US. We use the three information matrix tests to determine whether the standard beta model provides an adequate representation of the data. We will also briefly comment on inferences drawn from the AD and CVM tests. Maximization of the beta log-likelihood function was performed using the BFGS method with analytical first derivatives. We used B = 1000 bootstrap replications for performing the z 1B , z 2B and z 3 information matrix tests. In what follows, we model state and county level data for three time periods. In each case, we will report the information matrix tests' p-values, the point estimates of the beta parameters and their standard errors. We report clustered standard errors computed using information on each state's region and on each county's state.
The Covid-19 epidemic began in late 2019. It is estimated that approximately 247 million people had been infected with the new coronavirus by October 2021. The United States was the first country in the Americas to face a serious public health crisis brought on by the new coronavirus. On December 14, 2020, the US government began a campaign to vaccinate healthcare workers, which was later extended to the general population. Covid-19 death rates started to decrease as vaccination progressed.
Our variable of interest is the Covid-19 mortality rate per one hundred people. At the outset, we will work with statewide data, i.e., we use data on the 50 US states (n = 50). The death rates were computed using the cumulative number of deaths between January 22 and December 14 of 2020. We refer to this period as 'period 1'. The source of the data on Covid-19 deaths is the Centers for Disease Control and Prevention (https://data.cdc.gov/). Data on state populations in 2020 were obtained from [10]. Since the sample size is small, we only consider bootstrap-based information matrix testing inferences. We wish to determine whether the univariate beta model provides an adequate representation of the data. The model has a simple structure and is based on the assumption that the observations are i.i.d. Can it provide an acceptable and useful representation of the US Covid-19 mortality rates?
The minimum, mean, median, and maximum mortality rates, and the standard deviation are 0.0164, 0.0903, 0.0894, 0.2001 and 0.0423, respectively. The maximal value corresponds to New Jersey. The maximum likelihood estimates of the beta parameters (clustered standard errors in parentheses) are μ̂ = 0.0900 (0.0099) and ϕ̂ = 39.8208 (15.5240). The p-values of the z 1B , z 2B and z 3 tests of correct beta specification are 0.2870, 0.6070 and 0.4472, respectively. The model is not rejected at the usual significance levels. We thus conclude that it adequately represents the US state mortality rates. In Fig 6 we present the histogram of the mortality rates together with the beta density evaluated at the maximum likelihood estimates. The estimated density clearly provides a good approximation to the data histogram.
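Maximum likelihood estimation of the mean and precision parameters by BFGS with analytical first derivatives can be sketched as follows. This is an illustration rather than the code used in the analysis: the unconstrained parameterization (logit of μ, log of ϕ), the method-of-moments starting values and the synthetic data are assumptions, and the clustered standard errors are not computed.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import digamma, expit, gammaln, logit

def neg_loglik_and_grad(theta, y):
    """Negative beta log-likelihood and gradient in theta = (logit(mu), log(phi))."""
    mu, phi = expit(theta[0]), np.exp(theta[1])
    p, q = mu * phi, (1.0 - mu) * phi
    log_y, log_1my = np.log(y), np.log1p(-y)
    loglik = np.sum(gammaln(phi) - gammaln(p) - gammaln(q)
                    + (p - 1.0) * log_y + (q - 1.0) * log_1my)
    # Analytical score with respect to (mu, phi).
    y_star = log_y - log_1my
    mu_star = digamma(p) - digamma(q)
    d_mu = phi * np.sum(y_star - mu_star)
    d_phi = np.sum(mu * (y_star - mu_star) + log_1my - digamma(q) + digamma(phi))
    # Chain rule for the unconstrained parameterization (an assumption here).
    grad = np.array([d_mu * mu * (1.0 - mu), d_phi * phi])
    return -loglik, -grad

def fit_beta(y):
    """Return the maximum likelihood estimates (mu_hat, phi_hat)."""
    y = np.asarray(y, dtype=float)
    m, v = y.mean(), y.var()
    phi0 = max(m * (1.0 - m) / v - 1.0, 1e-2)   # method-of-moments start
    theta0 = np.array([logit(m), np.log(phi0)])
    res = minimize(neg_loglik_and_grad, theta0, args=(y,), jac=True, method="BFGS")
    return expit(res.x[0]), np.exp(res.x[1])

# Synthetic sample with parameters close to the state-level estimates.
rng = np.random.default_rng(42)
y_sim = rng.beta(0.09 * 40.0, 0.91 * 40.0, size=5000)
mu_hat, phi_hat = fit_beta(y_sim)
```

Working on the logit and log scales keeps the optimizer away from the boundaries μ ∈ {0, 1} and ϕ = 0 without requiring a constrained method.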
The previous analysis was performed using mortality rates computed up to December 14, 2020. Next, we will conduct a similar analysis, but based on more recent data. We consider state mortality rates calculated using data from January 22, 2020 to October 31, 2021. We refer to this more extended time period as 'period 2'. The minimum, mean, median, maximum and standard deviation values are 0.0550, 0.2120, 0.2227, 0.3370 and 0.0709, respectively. The maximum likelihood point estimates are μ̂ = 0.2112 (0.0199) and ϕ̂ = 27.8235 (8.9936). The estimated precision is now approximately 30% smaller than in the previous scenario. The p-values of the z 1B , z 2B and z 3 tests are 0.0600, 0.0570 and 0.0269, respectively. All tests reject the correct specification of the univariate beta model at the 10% significance level; z 3 rejects H 0 at 5%. Fig 7 presents the data histogram and the estimated beta density. The estimated beta density does not adequately represent the data asymmetry.
Unlike the previous results, all tests now reject the beta distribution at α = 10%. The data now cover two very different periods, namely: before and after the start of the nationwide vaccination campaign. There is thus clear data heterogeneity. The much smaller estimated precision (approx. 28 vs approx. 40) is probably due to such heterogeneity.
The mortality rates in the two periods show high positive correlation (0.8252), as expected, given the cumulative nature of the observations. The univariate beta model is not rejected by the information matrix tests when the shorter time period is used. It thus provides a good description of the statewide Covid-19 mortality rates. The second time period, however, covers the Covid-19 vaccination campaign. Since the reach and impact of such a campaign were uneven across the 50 states, for reasons that include partisan political connotations and other [...]
Recall that much lower precision was obtained when the longest time period was considered (approx. 28). The z 1B , z 2B and z 3 p-values are, respectively, 0.5900, 0.2860 and 0.5087. These large p-values indicate that there is very little evidence against the beta law. We thus conclude that despite the impact of vaccination on Covid-19 mortality, the univariate beta model still provides a good representation of the data. The data histogram and the fitted beta density are presented in Fig 8. Visual inspection of such a figure suggests that the beta law yields a reasonably good data fit. Interestingly, there is less skewness than in the previous two cases.
The three fitted beta densities are presented in Fig 9. Notice that the estimated densities for periods 1 and 3 are similarly shaped and have somewhat similar precisions. By contrast, the fitted beta density obtained using data that cover both the period in which there was no vaccination and that of the vaccination drive is much more dispersed. As noted earlier, heterogeneity in the data leads to poor data fit. The information matrix tests indicated that the beta model yields an adequate data representation in periods 1 and 3, but not in period 2. It seems that the tests correctly detected that the heterogeneous nature of the data renders the beta law unable to adequately represent Covid-19 mortality rates.
We presented above an analysis of statewide Covid-19 mortality data in the US. The inferences obtained from the information matrix tests were quite informative. Such tests indicated that the beta law is able to adequately represent the data in two disjoint periods (before and after the start of the nationwide vaccination campaign), but not when the two periods are combined.
In what follows we will use death rates per 100 persons computed for US counties for periods 1, 2 and 3. The data on the cumulative total of deaths were obtained from the New York Times repository (https://github.com/nytimes/covid-19-data). In order to avoid inaccurate records, we only considered, in each time period, counties with at least one Covid-19 death and at least 15000 inhabitants. The sample sizes for periods 1, 2 and 3 are n = 2073, n = 2080 and n = 2080, respectively. Since the sample sizes are large, we will use all tests, i.e., z 1 , z 1B , z 2 , z 2B and z 3 . Mortality rates were calculated using the estimated populations in 2020 obtained from the Economic Research Service of the US Department of Agriculture (https://www.ers.usda.gov).
Initially, we will consider period 1. The minimum, mean, median, maximum and standard deviation values of the mortality rates are 0.0013, 0.0883, 0.0764, 0.4554 and 0.0596, respectively. The maximum likelihood estimates are μ̂ = 0.0884 (0.0055) and ϕ̂ = 22.1618 (1.9529). The z 1 , z 1B , z 2 , z 2B and z 3 tests' p-values are 0.0978, 0.1370, 0.0927, 0.1540 and 0.1022, respectively. No test rejects the beta law at α = 5%. The tests that use bootstrap resampling also do not reject such a hypothesis at α = 10%. The p-values of the tests that use asymptotic critical values are slightly smaller than 0.10. Overall, we conclude that Covid-19 mortality rates can be adequately represented by the beta law in period 1.
Next, we will perform inferences with data from period 3. The minimum, mean, median, maximum and standard deviation of the mortality rates are 0.0085, 0.1632, 0.1528, 0.4734 and 0.0786, respectively. The point estimates are μ̂ = 0.1631 (0.0083) and ϕ̂ = 20.4761 (1.5648). The p-values of the z 1 , z 1B , z 2 , z 2B , and z 3 tests are 0.0529, 0.0830, 0.0360, 0.0750, and 0.0903, respectively. Except for z 2 , no test rejects the beta law at α = 5%. We thus conclude that it can be used to adequately represent county-level Covid-19 mortality rates in the third and final period. We will return to these results later. Notice that there is considerably more uncertainty when data from period 2 are used.
Interestingly, similar testing inferences were obtained with state and county data, namely: (i) the univariate beta model provides an adequate description of Covid-19 mortality rates with data either from prior to the nationwide vaccination drive or from when such a drive was under way; (ii) there is evidence against the correct specification of the beta model when Covid-19 mortality rates are computed using data that cover both periods (no vaccination and nationwide vaccination). The tests thus indicate that the beta distribution is not an adequate model for Covid-19 mortality rates under substantial data heterogeneity.
As noted earlier, we also performed the AD and CVM tests using both state and county data. The corresponding p-values for state data are: 0.4691 and 0.8734, period 1; 0.2277 and 0.4339, period 2; 0.9413 and 0.3360, period 3. With county data, we obtained the following p-values: 0.7414 and 0.5299, period 1; 0.3250 and 0.4856, period 2; 0.8765 and 0.8010, period 3. All p-values are quite large, and hence the beta model is accepted in all scenarios, i.e., for the three time periods and when state or county data are used. In particular, unlike the information matrix tests, the two non-parametric tests are not able to reject the beta model when there is marked data heterogeneity (period 2). By contrast, our tests indicate that the univariate beta model is only appropriate when there is reasonable homogeneity in the data (periods 1 and 3).
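One way to account for estimated parameters in a Cramér-von Mises test of the beta law is a parametric bootstrap, sketched below. This is a swapped-in technique for illustration: it is not the modification of [27] used for the p-values reported above, and it will not reproduce them. The function names, the number of bootstrap replications and the synthetic sample are assumptions.

```python
import numpy as np
from scipy import stats

def fit_beta_shapes(y):
    """Maximum likelihood shape parameters with support fixed at (0, 1)."""
    a, b, _, _ = stats.beta.fit(y, floc=0, fscale=1)
    return a, b

def cvm_statistic(y, a, b):
    """Cramér-von Mises statistic against the beta(a, b) distribution."""
    return stats.cramervonmises(y, "beta", args=(a, b)).statistic

def bootstrap_cvm_pvalue(y, n_boot=199, seed=None):
    """Parametric bootstrap p-value; parameters are re-estimated in each replication."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    a_hat, b_hat = fit_beta_shapes(y)
    observed = cvm_statistic(y, a_hat, b_hat)
    exceed = 0
    for _ in range(n_boot):
        y_star = rng.beta(a_hat, b_hat, size=y.size)
        a_star, b_star = fit_beta_shapes(y_star)
        if cvm_statistic(y_star, a_star, b_star) >= observed:
            exceed += 1
    return (1 + exceed) / (n_boot + 1)

# Synthetic beta sample (illustrative only).
rng = np.random.default_rng(7)
y_sim = rng.beta(2.0, 8.0, size=80)
p_value = bootstrap_cvm_pvalue(y_sim, n_boot=49, seed=1)
```

Re-estimating the parameters inside each replication is what makes the bootstrap null distribution reflect the fact that the beta parameters are unknown.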
We will now further examine (i) the data heterogeneity that caused the rejection of the beta law in period 2 and (ii) the acceptance of the beta law in period 3 when the vaccination drive was under way. As noted earlier, the Covid-19 mortality rates computed for period 2 cover two quite distinct periods: January 22, 2020 through December 14, 2020 (period 1) and December 15, 2020 through October 31, 2021 (period 3). (Recall that period 2 consists of the merging of periods 1 and 3.) The correlation coefficient between statewide death rates in periods 1 and 3 is weak: 0.3735. This weak correlation indicates that the mortality rates in such periods obey different dynamics. This was expected because, unlike what took place in period 3, there was no nationwide vaccination drive in period 1. Additionally, the states with the lowest mortality rates in period 1 (period 3) are Vermont, Hawaii, Maine, Oregon, and Utah (Vermont, Hawaii, New York, Alaska, and Maine) whereas those with the highest death rates in period 1 (period 3) are New Jersey, Massachusetts, Mississippi, Rhode Island, and North Dakota (Arizona, Alabama, West Virginia, Florida, and Georgia). Consider, e.g., New Jersey and Massachusetts. They are the states with the highest Covid-19 mortality rates in period 1, and yet their corresponding ranks in period 3 are 28 and 32. Arizona and Alabama display the highest death rates in period 3, and yet their ranks in period 1 are 14 and 9, respectively. Again, it is clear that the death rates in periods 1 and 3 (which are combined in period 2) are considerably heterogeneous.
Next, we re-examine Covid-19 mortality rates in period 3; in particular, we examine the finding that the univariate beta model yields an adequate representation for such rates. There was a nationwide vaccination drive under way in period 3, and its reach negatively impacted death rates. We obtained data on the total number of fully vaccinated people by October 31, 2021. The source of the data is the Our World in Data repository (https://ourworldindata.org/us-states-vaccinations). The correlation between death and vaccination rates in period 3 is −0.5858 (state data). A natural question is: Given that mortality rates are negatively impacted by vaccination rates, why was the univariate beta model found to be correctly specified? Why use a fixed mean model if the distribution mean appears to be impacted by an explanatory variable (vaccination rate)? At the outset, we note that some states considerably weaken the inverse relationship between the two variables in period 3, namely: Alaska, Arizona, Florida, Massachusetts, North Dakota, and Rhode Island. In particular, the Arizona, Florida, Massachusetts, and Rhode Island (Alaska and North Dakota) Covid-19 mortality rates are higher (lower) than expected based on the corresponding vaccination levels. The inverse correlation between death and vaccination rates becomes considerably stronger when computed without such states: −0.7592 (state data). We removed from the data all counties of the six states that weaken the impact of vaccination reach on death rates, and performed the tests again. The z 1 , z 1B , z 2 , z 2B , and z 3 p-values become 0.0289, 0.0600, 0.0162, 0.0530, and 0.0460, respectively. The z 1 , z 2 and z 3 tests now reject the univariate beta model at α = 5% whereas the z 1B and z 2B p-values are only marginally larger than 0.05. Hence, there is now evidence against the model.
Overall, the information matrix tests' inferences suggest that, as long as the negative impact of vaccination reach on death rates is moderate (complete data), the beta law can be adequately used to represent Covid-19 mortality rates. When such a negative impact becomes more pronounced (incomplete data, counties of six states removed from the data), the univariate beta model should no longer be used. In that case, practitioners should search for a more elaborate model. By contrast, the two non-parametric tests continue to accept the univariate beta model even when the Alaska, Arizona, Florida, Massachusetts, North Dakota, and Rhode Island counties are not considered; the AD and CVM p-values are 0.3025 and 0.5788, respectively.
Finally, using the three county data samples, we compare the data fits yielded by the beta distribution to those obtained with the following alternative laws: Kumaraswamy, simplex, and unit Weibull. To that end, we computed, for each sample period and for each distribution, the values of the following information criteria: Akaike Information Criterion (AIC), Corrected Akaike Information Criterion (AICc), Bayesian Information Criterion (BIC), Hannan-Quinn Information Criterion (HQIC), Weighted-Average Information Criterion (WIC) and Empirical Information Criterion (EIC). The latter employs bootstrap resampling and proved to be very effective in dynamic beta modeling; see [28]. We used 1000 bootstrap replications, i.e., 1000 pseudo-samples were generated for computing the EIC values. We also computed the AD and CVM statistics. For all measures, smaller values indicate better data fits. The results are presented in Table 8. They show that, according to all information criteria (AIC, AICc, BIC, HQIC, WIC, and EIC), the best data fits in the three sample periods are yielded by the beta law. As for the two non-parametric test statistics: in period 1, the beta model was the winner according to both; in period 2, it was the runner-up according to both, slightly behind the Kumaraswamy law; in period 3, it was the winner according to CVM and the runner-up according to AD, behind the Kumaraswamy model. Considering the eight measures and the three sample periods, the beta law was the winner in 21 out of the 24 cases.
Fig 11 contains the data histogram and the estimated beta density for period 3, as in Fig 8, together with the fitted Kumaraswamy (KW), simplex and unit Weibull (UW) densities. Visual inspection of the figure shows that the beta law best fits the data histogram. In order to further examine the two best data fits, we produced quantile-quantile (QQ) plots for the beta and Kumaraswamy laws, again using data from period 3.
In both panels of Fig 12, empirical quantiles are plotted against theoretical quantiles, the 45-degree line indicating perfect agreement between both sets of quantiles. The Kumaraswamy and beta laws fit the data quite well up to approximately 0.35 and 0.45, respectively. It is thus clear that the beta law outperforms the Kumaraswamy law in the sense that it yields better agreement between empirical and theoretical quantiles.
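Four of the information criteria used in the comparison above have simple closed forms in the maximized log-likelihood, the number of parameters k and the sample size n. The sketch below computes them for a fitted beta law; it is an illustration on synthetic data, the function name `information_criteria` is hypothetical, and WIC and EIC are not covered.

```python
import numpy as np
from scipy import stats

def information_criteria(loglik, k, n):
    """AIC, AICc, BIC and HQIC from a maximized log-likelihood (smaller is better)."""
    aic = -2.0 * loglik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    bic = -2.0 * loglik + k * np.log(n)
    hqic = -2.0 * loglik + 2.0 * k * np.log(np.log(n))
    return {"AIC": aic, "AICc": aicc, "BIC": bic, "HQIC": hqic}

# Synthetic sample (illustrative only); the beta law has k = 2 parameters.
rng = np.random.default_rng(0)
y_sim = rng.beta(3.0, 15.0, size=500)
a_hat, b_hat, _, _ = stats.beta.fit(y_sim, floc=0, fscale=1)
loglik_beta = stats.beta.logpdf(y_sim, a_hat, b_hat).sum()
criteria_beta = information_criteria(loglik_beta, k=2, n=y_sim.size)
```

The same function applies to any competing law once its maximized log-likelihood is available; since all four laws compared here have two parameters, the ranking by these criteria coincides with the ranking by log-likelihood.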

Concluding remarks
The beta distribution is commonly used to model variables that assume values in the standard unit interval. We developed information matrix tests that can be used to test whether the univariate beta model yields an adequate representation of the data. The null hypothesis of correct model specification is tested against the alternative hypothesis that the model specification is in error. The tests seek to verify whether the information matrix equality holds. As is well known, this equality only holds when the model is correctly specified. The tests' small sample behavior can be improved by using data resampling (bootstrap). We presented the results of extensive Monte Carlo simulations that showed that the tests have good power against different forms of model misspecification, including the case in which the univariate beta model is fitted using data that have an underlying regression structure. We presented an empirical analysis of Covid-19 mortality rates in the US. We considered three sample periods: (i) before, (ii) before and after, and (iii) after the beginning of the nationwide vaccination drive. The testing inferences indicated that the beta law yields a good representation of the data in the pre-vaccination period. There is also evidence in favor of such a model when mortality rates are computed using data that only cover the vaccination drive period as long as the negative impact of vaccination reach on death rates is moderate; when such an impact is strong, the univariate beta model is rejected. The beta law is also rejected by the information matrix tests when mortality rates are computed using data that cover both periods (before and after the start of the vaccination campaign). The rejection of the beta distribution in this case is due to data heterogeneity. Our results should be viewed as an initial exploration of the usefulness of information matrix tests for fractional data analysis.
The tests we presented proved to be quite useful when applied to the univariate beta model. In future research, we will extend the results presented in this paper to cover other univariate laws that are used to model fractional data (e.g., Kumaraswamy and simplex). We will also seek to extend our results to regression settings, in particular to the beta regression model introduced by [18], and to dynamic beta models, such as the βARMA model introduced by [29,30]; see also [28,31]. The beta parameterization used in this paper, which is indexed by mean and precision parameters, will be helpful for the aforementioned extensions of our results.